Search for: All records

Creators/Authors contains: "Zheng, Z"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

  1. Spatial sound reasoning is a fundamental human skill, enabling us to navigate and interpret our surroundings based on sound. In this paper, we present BAT, which combines the spatial sound perception ability of a binaural acoustic scene analysis model with the natural language reasoning capabilities of a large language model (LLM) to replicate this innate ability. To address the lack of existing datasets of in-the-wild spatial sounds, we synthesized a binaural audio dataset using AudioSet and SoundSpaces 2.0. Next, we developed SpatialSoundQA, a spatial sound-based question-answering dataset, offering a range of QA tasks that train BAT in various aspects of spatial sound perception and reasoning. The acoustic front-end encoder of BAT is a novel spatial audio encoder named Spatial Audio Spectrogram Transformer, or Spatial-AST, which by itself achieves strong performance across sound event detection, spatial localization, and distance estimation. By integrating Spatial-AST with the LLaMA-2 7B model, BAT transcends standard Sound Event Localization and Detection (SELD) tasks, enabling the model to reason about the relationships between the sounds in its environment. Our experiments demonstrate BAT's superior performance on both spatial sound perception and reasoning, showcasing the immense potential of LLMs in navigating and interpreting complex spatial audio environments.
    Free, publicly-accessible full text available May 17, 2026
  2. We measure the projected two-point correlation functions of emission-line galaxies (ELGs) from the Dark Energy Spectroscopic Instrument One-Percent Survey and model their dependence on stellar mass and [OII] luminosity. We select ∼180,000 ELGs with redshifts of 0.8 < z < 1.6, and define 27 samples according to cuts in redshift and both galaxy properties. Following a framework that describes the conditional [OII] luminosity–stellar mass distribution as a function of halo mass, we simultaneously model the clustering measurements of all samples at fixed redshift. Based on the modeling result, most ELGs in our samples are classified as central galaxies, residing in halos of a narrow mass range with a typical median of $\sim 10^{12.2\text{--}12.4}\,h^{-1}\,M_\odot$. We observe a weak dependence of clustering amplitude on stellar mass, which is reflected in the model constraints and is likely a consequence of the 0.5 dex measurement uncertainty in the stellar mass estimates. The model shows a trend between galaxy bias and [OII] luminosity at high redshift (1.2 < z < 1.6) that is otherwise absent at lower redshifts.
    Free, publicly-accessible full text available October 9, 2026